ABSTRACT


Accurate prediction of officer loss behavior is essential for the planning of 
personnel policies and executing the U.S. Army’s Officer Personnel Management System 
(OPMS). Inaccurate predictions of officer strength affect the number of personnel 
authorizations, the Army’s budget, and the necessary number of accessions. Imbalances
of officer strength in the basic branches affect the Army’s combat readiness as a whole.


Captains and majors comprise a critical management population in the United 
States Army’s officer corps. This thesis analyzes U.S. Army officer loss rates for 
captains and majors and evaluates the fit of several time series models. The results from 
this thesis validate the time series forecasting technique currently used by the Army G-1,
Winters method-additive.
EXECUTIVE SUMMARY


Accurate prediction of officer loss behavior is essential for the planning of 
personnel policies and executing the U.S. Army’s Officer Personnel Management System 
(OPMS). Inaccurate predictions of officer strength affect the number of personnel 
authorizations, the Army’s budget, and the necessary number of accessions. Imbalances 
of officer strength in the basic branches degrade the Army’s combat readiness as a whole. 
The objective of this thesis is to conduct a time series analysis of U.S. Army officer loss 
rates for captains and majors and identify a time series model that accurately predicts the 
expected number of commissioned officer losses for each basic branch by grade (captain
and major).


Individual loss and gain records from October 1998 through September 2004,
obtained from the Total Army Personnel Data Base-Active Officer (TAPDB-AO), were
aggregated by grade and basic branch. The aggregated data form a time-series of net
losses. The time-series for each grade (O-3 and O-4) and basic branch were analyzed using
the SAS Time-Series Forecasting System (TSFS).


Ten time-series models, determined to be appropriate for the data, were fit to the 
data using SAS TSFS. Akaike’s Information Criterion was used to evaluate the fit of 
each of the ten models. Two models, seasonal exponential smoothing and Winters
method-additive, distinguished themselves from the others. These two models had the
best fits in every series.


Winters method-additive, the current forecasting technique used by the Strength
Analysis and Forecasting Branch, Army G-1, is validated. Although seasonal
exponential smoothing is less complex, having one fewer parameter, the increase in fit as
measured by AIC is negligible.


However, these best-fitting models have weak predictive power. Predictions from
the seasonal exponential smoothing model for 2004 were compared to the corresponding
observed values in our test set. The observed and predicted values for captains have a
correlation of 0.21; for majors the correlation is 0.52.




A comparison of the results of multiple regression and time-series is worth
investigating. Such a study would require the collection of external monthly econometric
variables such as gross domestic product, unemployment rate, durable goods orders, and
so on. Multiple regression may achieve better fitting models than the time-series shown
here.




I. INTRODUCTION 


A. BACKGROUND 

The Army is currently transforming its structure and moving toward a modular 
force. The Army Chief of Staff, General Peter Schoomaker, stated this very clearly in a 
July 2004 Defense Department Special Briefing on U.S. Army transformation when he 
said: 

     We are changing our Army along three primary avenues -- and this is
     important, I believe, as we talk about this the rest of the afternoon, the
     time that we have together, to think in terms of the context of what we're
     doing. The first is that we are restructuring the force into modular
     formations. And we're calling these the combat forces, brigade combat
     team[s], units of action. And this [is] a path on the transformation towards
     the eventual Future Combat System -- units of action. (Defense
     Department, 2004)


In conjunction with this structure change, the Deputy Chief of Staff, Army G-1 is 
currently involved in changing the way officers are managed in the promotion and career 
field designation (CFD) process. These changes require a forward-looking ability in
order to predict what each branch or career field will look like in the future.


The Army is redesigning the Officer Personnel Management System (OPMS). In
past years the Army has conducted a functional area (FA) designation between the 5th and
6th year of an officer’s service. This timing was logical. It allowed an officer to complete
a company command in his or her basic branch prior to FA designation and then alternate
between basic branch and FA assignments thereafter. In the past, all officers received
their FA designation from a preference-based board. However, the reality was that very
few actually served in a FA position as a captain. At the time of this writing (April
2005), CFD occurs at the ten-year time in service point. An officer retains and will work
in this career field for the duration of his or her career. The result is often officers whose
FA designations do not align with their CFD and hence do not support Army
requirements.


As a result, Officer Personnel Management Division (OPMD) directed that
starting with Cohort Year Group 1999, officers no longer go before a FA designation
board or be designated a second career field (U.S. Army Officer Professional, 2004). FA
proponents now review the entire year group and are able to identify, recruit, and select
officers to serve a FA assignment.


The current system designates an officer into a career field after selection to 
major, which occurs at about the 10-year point. After career field designation the officer 
remains in that branch or career field full-time. Since many career fields invest a lot of 
time (up to three years of training) and money in qualifying officers, the Army is moving 
toward early designation of a limited number of officers starting at the seven-year mark. 
To decide which branches these early career-field designated officers will come from and 
to which branches they will go, the Army must be able to accurately predict the number
of officers expected to be in each branch at the ten-year point.


For promotion and career field designation purposes, the Career Systems Analysis 
and Studies Branch looks at the strength of the entire rank into which the board will 
promote officers. The predicted number of promotions from the primary zone, above the 
zone and below the zone is added to the current strength to determine post-board strength 
for each branch or career field. This expected strength is then compared to the force 
structure requirements of the promotable rank to determine the number of promotions or
career field designations needed for each branch or career field.


Inaccurate predictions of officer strength affect the number of personnel 
authorizations, the Army’s budget, and the necessary number of accessions and losses. 
Imbalances of branches and career fields affect the Army’s combat readiness as a whole. 
The results from this thesis will help the Army G-1 to assess current force structure and 
readiness, determine loss and accession policies, and contribute to the design of the future
force structure of the Army.


B. THESIS OBJECTIVE 

1. Objective 

Captains and majors comprise a critical management population. In this thesis we
conduct a time series analysis of U.S. Army officer loss rates for captains and majors and
identify a time series model that accurately predicts the expected number of
commissioned officer net losses for each basic branch by grade (O-3 and O-4). The
following tasks were performed pursuant to this objective:


a. Monthly historical data containing individual loss, gain, and
promotion records was constructed from queries into the Total Army Personnel Data
Base-Active Officer (TAPDB-AO). The first five years of data was used as training data
to identify the best models. The last year of data was used as test data. The test data was
quarantined and used later to evaluate the best model.

b. The current forecasting technique, Winters method-additive, was included
to establish a baseline for comparison of the other techniques considered and to gain
insight into the techniques’ accuracy.

c. Other models were developed.

d. Measures of accuracy were developed and used to evaluate each
predictive technique.

e. A comparative analysis of each forecasting technique was
conducted to identify the model that provided the most accuracy.

f. Forecasts from the best model were compared against observed
values in the test set to evaluate the model’s predictive power.


2. Organization 
This introductory chapter provides the reader with a description of the problem
and the organization of the thesis. It also provides the motivation for conducting this
research.


Chapter II contains a description of the TAPDB-AO data provided by the Army
G-1, Deputy Chief of Staff, Career Systems Analysis and Studies Branch. It also
describes problems with the data and how these problems were resolved.


Chapter III contains the details of how the analysis was conducted. It first
describes how the data was sorted for analysis. Secondly, it describes each time series
model considered in the analysis that was used to forecast expected loss rates. Finally,
the chapter describes the goodness of fit measures used to compare the models.

Chapter IV contains the analysis of results. The best fitting models, for each basic
branch and grade (O-3 and O-4), are presented in table form. The best fitting models are
evaluated by comparing their predicted values against observed values in our test set.
This chapter concludes with a summary of results.


Chapter V concludes the thesis. It contains an overall summary and conclusions.
It also makes recommendations for future study.


C. LITERATURE REVIEW

Regression analysis and time-series analysis are two scientific approaches to
making attrition forecasts. A wealth of historical research is available concerning the
attrition of military forces. Nearly all of this historical research uses regression analysis.


In addition to making forecasts, regression analysis identifies variables that affect
attrition. It is useful in identifying the characteristics of who is being lost. This
information influences policy-makers who make decisions in an attempt to influence
realized attrition.


Yaffee (Yaffee, 2000) describes a time series as “a sequence of observations
ordered by a time parameter.” The result of a time-series analysis is just a forecast. No
inference about the characteristics of who is being lost can be made.


Rubiano (Rubiano, 1993) justified his use of regression by citing “the desire to
forecast.” Esmann (Esmann, 1984) cites simplicity as his reason for using regression.
Time-series analysis could have been used for both studies. Time-series provides the
forecast that Rubiano requires. It also provides the simplicity that Esmann sought.


The research question for both aforementioned studies deals with attrition rates, 
not characteristics. For this author, the choice of regression or time-series largely 
depends on what is being asked. If the question is just about attrition, as in this research, 
time-series is a good approach. If the question is broader, or the analyst expects
questions about the characteristics of those lost, regression would be a better approach.


This thesis develops several time series models for predicting officer loss rates by
grade and control branch. Dewald (Dewald, 1996) conducted a similar time-series
analysis of U.S. Army enlisted loss rates.


Although not specifically stated in his thesis, Dewald assumed that enlisted
soldiers’ losses were homogeneous across basic branches. A key difference between this
thesis and Dewald’s is that the officer population is not assumed to be homogeneous
among basic branches. This is a significant difference since predictions about specific
populations are often required.


One of the models Dewald considered was the auto-regressive integrated moving
average (ARIMA) model. An ARIMA model requires a stationary series. If the underlying
series is not stationary, it can be differenced to make it stationary. This differencing is
the ‘I’ (integrated) in ARIMA.


An ARMA model (or ARIMA with I=0) would be valid if the underlying series is
shown to be stationary. An examination of the autocorrelation and partial autocorrelation
plots, generated from the series, is necessary to make a claim of stationarity. The stationarity
condition and correlation plot properties were assumed by Dewald but will be examined
in detail in this thesis.


The models used in this study are limited in their ability to make predictions 
beyond one or two periods. Time series forecasts assume the conditions surrounding the 
forecast remain constant (Yaffee, 2000). Since time-series models make extrapolatory
predictions, they should be used cautiously as a tool for making long-term forecasts.
II. RESEARCH METHODOLOGY


A. DATA VALIDATION 

The data provided by the Army G-1 for this thesis came from the Army’s Total
Army Personnel Data Base-Active Officer (TAPDB-AO) database. It contains each gain,
promotion and loss transaction that occurred between October 1998 and September 2004,
a period of seventy-two months. Each individual record contains numerous variable
fields: type of transaction (gain, promotion, loss), social security number, month and year
of the transaction, the officer’s basic and control branch, and information on the officer’s
rank.


Gain transactions record officers who just came onto active duty or returned after 
a break in service. Promotion transactions record officers’ promotions, including their 
current and previous rank. Loss transactions record the retirement, separation, or death of
officers, which translates into a reduction in total officer strength. A list of the ninety-two
ways an officer can be classified as a loss is contained in Appendix A.


Chatfield (Chatfield, 2001) describes the process of data cleaning as examining
the quality of the data and considering modifying the data to remove any obvious errors.
The TAPDB-AO data provided by the G-1, which contained over 140,000 records,
required extensive work in this area. A cursory look at the raw data reveals numerous
instances of duplicate or repetitive loss records for the same transaction. A duplicate
means that the same exact record exists in the data-set more than once. A repetitive
record means that the same record exists in the data for two or more consecutive months.
Observations of duplicate and repetitive transactions can be found in gain and promotion
records as well. These errors had to be corrected before any analysis could
begin.

1. Identification of Database Errors 

A count of the social security numbers in the data revealed that only 69% are unique.
One would expect some duplication of social security numbers to appear since the data
spans six years. For example, a social security number for a second lieutenant which
appears as a gain in October 1998 would be expected to appear with an associated first
lieutenant promotion two years later in October 2000, and again with a promotion to
captain in October 2002.


As expected, duplications like the one described in the example above are present. 
However, there are also clearly erroneous instances of multiple records for the same 
promotion, loss or gain. For example, for a major who separates from the service in 
January 2000, there should be a corresponding single loss record in the January 2000 
data. Yet in many instances the data contains two or more loss records, or additional loss 
records in the months following. As a result, inclusion rules were developed to clean the 
data by screening out these obvious errors in the TAPDB-AO data. 
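
The distinction drawn above between duplicate and repetitive records can be illustrated with a short sketch. It is written in Python rather than the SAS used in this thesis, and the record layout of (person identifier, year, month) is a hypothetical stand-in for the TAPDB-AO fields, not the actual schema.

```python
from collections import defaultdict


def month_index(year, month):
    """Map (year, month) to one integer so consecutive months differ by 1."""
    return year * 12 + (month - 1)


def classify_records(records):
    """Tag each record of one transaction type (e.g. losses) per person.

    records: iterable of (person_id, year, month) tuples (hypothetical layout).
    Returns {person_id: [(month_index, tag), ...]} where tag is 'first',
    'duplicate' (same month as the previous record), 'repetitive' (the month
    right after the previous record), or 'later' (any other repeat).
    """
    by_person = defaultdict(list)
    for person_id, year, month in records:
        by_person[person_id].append(month_index(year, month))

    labels = {}
    for person_id, months in by_person.items():
        months.sort()
        tags = ["first"]
        for prev, cur in zip(months, months[1:]):
            if cur == prev:
                tags.append("duplicate")
            elif cur == prev + 1:
                tags.append("repetitive")
            else:
                tags.append("later")
        labels[person_id] = list(zip(months, tags))
    return labels
```

Counting the tags other than 'first' gives a rough measure of how many records the inclusion rules would need to screen out.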

2. Development of Inclusion Rules 

Including the duplicate and repetitive records would lead to biased prediction 
results. The data had to be cleansed to eliminate any bias and accurately represent true 
realized losses. To this end, a set of data inclusion rules for gains, promotions and losses 
was developed to eliminate any duplicate or repetitive observation error. A summary of
the data inclusion rules used to eliminate duplicate and repetitive record errors for gains,
losses and promotions is outlined below.


a. Gains

• Only zero, one, or two gains are allowed per individual.
• If an individual has two gain records, a loss record must separate them.
• If an individual has duplicate gain records in the same month and
  year, only the first gain record is valid.
• An officer can be gained into any rank.

b. Promotions

• Only zero, one, or two promotions are allowed per individual.
• If an individual has a duplicate promotion record in the same
  month and year, only the first promotion record is valid.
• If an individual has a second promotion record, it must follow the
  first promotion by at least thirteen months.
• An officer can only be promoted into the next higher rank.

c. Losses

• Only zero or one loss is allowed per individual.
• If an individual has duplicate loss records (i.e., same month and
  year), only the first loss is valid.
• If an individual has repetitive loss records (i.e., consecutive
  months), only the first one is valid.
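
A minimal sketch of how the loss and promotion rules above could be applied to one officer's records follows. It is written in Python for illustration only. The function names are hypothetical, months are assumed to be integer month counts, and the rank check (promotion only into the next higher rank) is left out.

```python
def valid_losses(months):
    """Keep only the earliest loss record; duplicates and repeats are dropped."""
    return sorted(months)[:1]


def valid_promotions(months):
    """Apply the promotion rules to one officer's promotion months.

    months: list of integer month counts (hypothetical encoding).
    Keeps the first promotion, then at most one more that follows the
    first by at least thirteen months.
    """
    kept = []
    for m in sorted(months):
        if not kept:
            kept.append(m)
        elif len(kept) == 1 and m - kept[0] >= 13:
            kept.append(m)
        # Anything else is dropped: a same-month duplicate, a repeat that
        # comes too soon, or a third promotion.
    return kept
```

For example, promotion records in months 10, 10, 15, 24 and 40 reduce to months 10 and 24 under these rules.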
B. DATA AGGREGATION

The data provided by the Army G-1 consists of individual gain, promotion and 
loss records from October 1998 to September 2004. This means that the data had to be 
aggregated in such a way as to capture the net gain or loss by basic branch and grade for 
each month. Additionally, promotions had to be redefined since they result in a loss to 
the officer’s previous grade and a gain to the officer’s new grade. 

1. Promotions Redefined 

Two new variables, “promotion gain” and “promotion loss” were introduced to 
accommodate the effects of promotions on net losses. For example, an infantry officer 
who was promoted from captain to major would be simultaneously classified as an 
infantry captain loss (promotion loss) and an infantry major gain (promotion gain). 

2. Aggregation Procedure 

‘Loss’ and ‘gain’ variables were used in addition to the promotion variables
described above. A loss is defined as an individual who separates from the Army by one
of the ninety-two reasons listed in Appendix A. A gain is defined as an individual who
enters or returns to active duty from one of the eleven sources listed in Table 1.

SOURCE                      TYPE
USMA                        USMA
ROTC-SCHOLARSHIP            ROTC
ROTC                        ROTC
OCS-DMG                     OCS
OCS                         OCS
NATIONAL GUARD STATE OCS    OTHER
DIRECT APPOINTMENT          OTHER
USAFA                       OTHER
USNA                        OTHER
USMMA                       OTHER
OTHER                       OTHER

Table 1. Sources of Officer Gains


Aggregating the data to capture net gains or losses by basic branch and grade for
each month is a simple summation of variables. This summation is represented in
Equation 2.1 where index i represents the month and index j represents basic branch. A
negative result means there is a net gain in strength in month i for basic branch j and a
positive result means there is a net loss. Applying this summation procedure to the data
results in a sixty-month time series table for each of the seventeen basic branches. The
formula in Equation 2.1 is applied twice, once for captains and once for majors, to
produce the respective aggregated loss tables. Applying the formula in Equation 2.1 to
the cleansed data results in the aggregated loss tables similar to those displayed below in
Table 2. The column headers indicate basic branches. A complete list of basic branch
abbreviations is in Table 3. The complete net loss tables for both captains and majors are
contained in Appendix B.


\[
\text{NetLoss}_{ij} = \text{Loss}_{ij} + \text{PromLoss}_{ij} - \text{Gain}_{ij} - \text{PromGain}_{ij} \qquad \forall\, i \in I,\ j \in J \tag{2.1}
\]

Date        AD   AG   AR   AV   CM   EN   FA   FI   IN   MI
10/1/1998   16    5   19   30    4   15   39    3   41   20
11/1/1998    3   -1   -4    ?    6    ?    4    0    1    5
12/1/1998   12    ?   26   16    5   19   27    4   30   40
1/1/1999    13    9   16   23    ?    6   13    3   26   40
2/1/1999     5    1   12   21    7   30    6    2    3    4
3/1/1999    17   16   24   25   10   20   38    1   51   32
4/1/1999    12    1   23   3?   11   26   41    8   58   53
5/1/1999     8   -1   27   12   11   18   52    5   20   33

Table 2. Partial Aggregated Loss Table
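
The summation in Equation 2.1, together with the promotion-loss and promotion-gain convention described earlier, can be sketched as follows. This is an illustration in Python, not the procedure actually used to build the tables. The transaction tuples and the names SIGN, expand_promotion and net_losses are assumptions made for the example.

```python
from collections import defaultdict

# Sign of each transaction type in Equation 2.1: losses add, gains subtract.
SIGN = {"loss": 1, "prom_loss": 1, "gain": -1, "prom_gain": -1}


def expand_promotion(month, branch, old_grade, new_grade):
    """Split one promotion into a loss from the old grade and a gain to the new."""
    return [
        (month, branch, old_grade, "prom_loss"),
        (month, branch, new_grade, "prom_gain"),
    ]


def net_losses(transactions):
    """Sum signed transactions into {(grade, month, branch): net loss}.

    transactions: iterable of (month, branch, grade, kind) tuples, where
    kind is one of the keys of SIGN (hypothetical layout).
    """
    table = defaultdict(int)
    for month, branch, grade, kind in transactions:
        table[(grade, month, branch)] += SIGN[kind]
    return dict(table)
```

With three infantry captain losses, one captain gain, and one promotion from captain to major in the same month, the sketch yields a net loss of 3 for infantry captains and -1 (a net gain) for infantry majors.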


AD   Air Defense Artillery
AG   Adjutant General's Corps
AR   Armor
AV   Aviation
CM   Chemical Corps
EN   Corps of Engineers
FA   Field Artillery
FI   Finance Corps
IN   Infantry
MI   Military Intelligence Corps
MP   Military Police Corps
MS   Medical Service Corps
OD   Ordnance Corps
QM   Quartermaster Corps
SC   Signal Corps
SF   Special Forces
TC   Transportation Corps

Table 3. Basic Branch Abbreviations

The aggregated net-losses are separated into training data and test data-sets for
both captains and majors. Training data is used to identify the best fitting model and
contains the majority of the data (Oct 98-Sept 03). The remaining data (Oct 03-Sept 04)
comprises the test set and is used to evaluate how well the best model predicts.
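
The split described above amounts to holding out the final twelve months of each seventy-two-month series. A minimal Python sketch, with a hypothetical function name, is:

```python
def split_train_test(series, test_months=12):
    """Return (training, test) where test is the last `test_months` values."""
    if not 0 < test_months < len(series):
        raise ValueError("test_months must be between 1 and len(series) - 1")
    return series[:-test_months], series[-test_months:]
```

Applied to a seventy-two-month series, this returns sixty training observations and twelve test observations, in time order.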
III. METHODOLOGY


A. DATA OBSERVATIONS 

1. Initial Data Observations 

SAS, version 8.1, is the statistical software package currently in use by the Army
G-1. We have chosen SAS to conduct this research since doing so will help the G-1
implement the recommendations.


Creating time-series graphs for each branch and grade is accomplished by writing 
code in the SAS editor window once the aggregated loss tables are created. Figure 1 
contains the SAS code used to create time series plots. Comments (between /* and */) 
are included to help explain the code. This SAS code generated the graph in Figure 2 for 
air defense captains. Similar code was used to produce the time-series loss graph for
ordnance majors in Figure 3.


data Captains;
   INFILE 'C:\CaptNoNames.txt' DLM='09'X DSD MISSOVER;  /*tab-delimited input file*/
   INPUT AD AG AR AV CM EN FA FI IN MI MP MS
         OD QM SC SF TC;                                 /*one column per basic branch*/
   date=intnx('month', '01OCT1998'd, _n_-1);             /*creates date variable*/
   format date monyy5.;                                  /*specifies format for date*/
run;

symbol i=join c=blue;               /*Use blue line for time series*/
axis1 label=(a=90 'Losses');        /*Label Losses on vertical axis*/

PROC gplot data=Captains;           /*Plots losses for each branch*/
   plot AD*date / vaxis=axis1;
   title justify=c 'AD Captain Losses';
run;
QUIT;

Figure 1. SAS Code for Creation of Time-Series Plots

[Figure: line plot titled “AD Captain Losses”; vertical axis “Losses”; horizontal axis “date”, with ticks every four months from SEP98 to SEP03]

Figure 2. Time-Series Plot of Air Defense Captains’ Losses


[Figure: line plot titled “OD Major Losses”; vertical axis “Losses”; horizontal axis “date”, with ticks every four months from SEP98 to SEP03]

Figure 3. Time-Series Plot of Ordnance Majors’ Losses

The time-series plots in Appendix C do not lend themselves to intuitive
interpretation. The only obvious feature of these graphs is five well-defined downward
spikes common to most of the captain graphs. These spikes, indicating a net gain for the
respective branch, are explained by promotion policy. Time in service determines when
second lieutenants get promoted to first lieutenant and later promoted to captain. This
means that all second lieutenants commissioned in the same month will be promoted to
both first lieutenant and later captain at the same time. Those spikes are therefore
expected since most commissioning is done in the May and June timeframe.


The majors’ time-series graphs do not contain well-defined spikes since
promotion to major follows a different policy. Captains are promoted to major from a
list. Army requirements determine how many captains from this list get promoted in a
given month. Unlike promotion to captain, this requirements-based promotion policy
prevents large gain spikes from occurring.

2. Detailed Data Observations 

Box and Jenkins (Box, 1976) describe a stationary series as being “characterized by an
equilibrium around a constant mean level as well as a constant dispersion around that
mean level”. In other words, a series that has a fixed mean and constant variance is said
to be stationary.
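
In symbols, the two conditions in this description can be written for a series \(Y_t\) as shown below. The notation \(\mu\) and \(\sigma^2\) is introduced here for illustration and is not taken from the cited source.

\[
E[Y_t] = \mu, \qquad \operatorname{Var}(Y_t) = \sigma^2 \qquad \text{for all } t
\]

Textbook definitions of weak stationarity usually add a third condition, that the covariance between two observations depends only on the lag \(k\) separating them and not on \(t\):

\[
\operatorname{Cov}(Y_t, Y_{t+k}) = \gamma_k \qquad \text{for all } t
\]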


Intuition does not provide much information for any of the time-series graphs in 
Appendix C. There is no obvious seasonality or trend in any of the plots. A student of 
time-series analysis might suggest that the plots appear roughly stationary. However, a 
formal test of stationarity is needed to confirm this belief. Stationarity is a necessary 
condition for many time-series models and must be tested for. Brocklebank 
(Brocklebank, 2003) suggests using the Dickey-Fuller Unit Root Test to accomplish this 
task. This test can be conducted in SAS, version 6.12 and newer, through the editor 
window or in the Time-Series Forecasting System. This test requires information about 
the auto-regressive properties of the series in question, so correlogram plots need to be 
generated before the Dickey-Fuller test can be conducted. 

a. Correlogram 

The correlogram, considered “one of the most useful tools in time-series
analysis” (Chatfield, 2001), assesses time-series behavior. Graphing the correlogram of
each time-series provides three key observations. First, it provides a picture from which
the auto-regressive parameter, used in many time-series models and necessary to perform
the Dickey-Fuller Unit Root Test for stationarity, can be estimated. Second, it provides a
picture from which the moving average parameter, common to many time-series models,
can be estimated. Lastly, it provides graphical evidence useful in examining the
assumption of a stationary series.


A SAS-generated correlogram provides autocorrelation and partial
autocorrelation graphs. Figure 4 shows the autocorrelation graph and Figure 5 shows the
partial autocorrelation graph for Military Police captains. The two dotted vertical lines to
the right and left of zero, often referred to as the bounds of stability, mark two
standard errors on either side of zero.


The autoregressive parameter estimate is obtained from the autocorrelation
graph. The autocorrelation in Figure 4 shows a time-series that attenuates immediately to
within the bounds of stability. This provides three results. First, the autoregressive
parameter is zero. This is the case when, “[a]part from the value at lag zero, which is
always one and tells us nothing, the autocorrelations all lay inside the bounds of stability”
(Chatfield, 2001). Second, it provides evidence that the series is stationary, since
autocorrelations for non-stationary series tend to attenuate slowly or even increase.
Lastly, there is no seasonality present. Spikes in the autocorrelation graph would be
present if seasonality existed: every three lags for quarterly seasonality or twelve for
annual seasonality (Yaffee, 2000).
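
The check described above, whether every autocorrelation lies inside the bounds, can be sketched in Python as follows. Two assumptions are made for illustration: the standard textbook estimator of the sample autocorrelation is used, and the bound is taken as the large-sample white-noise approximation of two standard errors, 2/sqrt(n). The standard errors printed by SAS may be computed differently and can vary with the lag.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev) / n  # lag-0 autocovariance
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


def outside_bounds(acf, n):
    """Lags whose autocorrelation exceeds the approximate 2-standard-error bound."""
    bound = 2 / math.sqrt(n)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]
```

For a sixty-month training series the approximate bound is about 0.258. The largest autocorrelation magnitude listed in Figure 4 is 0.22622 at lag 10, which falls inside that bound.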


MP CAPTAINS
The ARIMA Procedure
Name of Variable = MP
Mean of Working Series   -0.68333

Autocorrelations

Lag   Covariance    Correlation
  1   -32.227227     -.13176
  2    -9.182231     -.03754
  3     0.577764     0.00236
  4   -15.980019     -.06534
  5   -17.156134     -.07014
  6   -18.862806     -.07712
  7    19.821634     0.08104
  8    19.614130     0.08019
  9   -42.059208     -.17196
 10    55.330509     0.22622
 11   -20.167551     -.08246
 12    18.177167     0.07432
 13   -22.131449     -.09049
 14    -6.479509     -.02649
 15    -3.650347     -.01492

(The listing's asterisk bar chart, scaled from -1 to 1 with "." marking two standard errors, is not reproduced.)

Figure 4. Autocorrelation Plot for Military Police Captains

The partial autocorrelation graph in Figure 5 assists in identifying the
moving average parameter. This graph provides two useful results. First, the moving
average parameter is zero since no lags are necessary to move inside unity. Second, it
also supports the belief that the series is stationary since no significant spikes are present
at any lag.

The Dickey-Fuller Unit Root Test code in Appendix E generates
correlograms for the other basic branches. An analysis of these correlograms supports
similar findings with respect to autoregressive parameters, stationarity and moving
average parameters for all basic branches.


MP CAPTAINS
The ARIMA Procedure

Partial Autocorrelations

Lag   Correlation
  1    -0.13176
  2    -0.05587
  3    -0.01044
  4    -0.06997
  5    -0.09147
  6    -0.11035
  7     0.04569
  8     0.08627
  9    -0.16335
 10     0.18124
 11    -0.05464
 12     0.10133
 13    -0.08269
 14    -0.02623
 15    -0.04591

(The listing's asterisk bar chart, scaled from -1 to 1, is not reproduced.)

Figure 5. Partial Autocorrelation Plot for Military Police Captains


b. Dickey-Fuller Unit Root Test 

All results thus far indicate that we are dealing only with stationary series.
An additional check, the Dickey-Fuller Unit Root Test, is produced by the SAS
procedure PROC ARIMA. The complete code is in Appendix E. This code produced the
Dickey-Fuller Unit Root Test results for air defense majors shown in Figure 6.

AD Majors

Dickey-Fuller Unit Root Tests

Type          Lags   Rho        Pr < Rho   Tau           Pr < Tau   F       Pr > F
Zero Mean     0      -60.8952   <.0001     (illegible)   <.0001
Single Mean   0      -60.8961   0.0005     (illegible)   0.0001     30.39   0.
Trend         0      -64.7979   0.0001     (illegible)   <.0001

Figure 6. Dickey-Fuller Stationary Test


Notice that the lag parameter in Figure 6, identified by observing the autocorrelation
plot, is zero. In the code, this is accomplished by specifying adf=(0), where 0
is the autoregressive parameter identified from the autocorrelation plot. Rho represents
the coefficient of the lagged response variable. Tau tests whether the lagged term is
significant (Yaffee, 2000). The F statistic tests intercept and mean conditions. The F
statistic is ignored here since we are only interested in stationarity results and their
significance, which we determine from Rho and Tau respectively (Brocklebank, 2003).
A time-series with Rho and Tau probabilities less than the generally accepted .05 level of
significance is considered stationary (Yaffee, 2000).
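
To make the Rho and Tau statistics concrete, the sketch below computes them for the zero-mean case with a lag parameter of zero. It is written in Python and uses the textbook form of the test, in which the first difference of the series is regressed on its lagged value without an intercept. This is an assumption made for illustration: the function name is hypothetical, SAS may normalise the statistics slightly differently, and the probabilities reported in the SAS output come from Dickey-Fuller distribution tables that are not reproduced here.

```python
import math


def dickey_fuller_zero_mean(series):
    """Return (rho, tau) for the zero-mean Dickey-Fuller regression, no lags.

    Fits  dy_t = delta * y_{t-1} + e_t  by least squares without an
    intercept, where delta = alpha - 1 and alpha is the coefficient of
    the lagged response.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    n = len(diffs)
    sxx = sum(x * x for x in lagged)
    delta = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    rss = sum((d - delta * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(rss / (n - 1) / sxx)
    rho = n * delta        # normalised-bias statistic
    tau = delta / std_err  # studentised statistic
    return rho, tau
```

Large negative values of either statistic count against the unit-root (non-stationary) hypothesis. Deciding significance requires Dickey-Fuller critical values, not the usual t tables.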


The Dickey-Fuller Unit Root Test performs three types of tests
simultaneously. The characteristics of the time-series in question will dictate which type
of test is needed to conclude a series is stationary. The Zero Mean type is used for a
time-series with random walk and without drift or trend. The Single Mean type is used
for a series with random walk and drift but without trend. Trend is used when there is
random walk and drift with trend.
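
In the usual textbook presentation, and with the lag parameter set to zero, the three test types correspond to the regressions below. The symbols \(\alpha\), \(\mu\), \(\beta\) and \(e_t\) are introduced here for illustration and are not taken from the SAS output.

\[
\begin{aligned}
\text{Zero Mean:}\quad   & Y_t = \alpha Y_{t-1} + e_t \\
\text{Single Mean:}\quad & Y_t = \mu + \alpha Y_{t-1} + e_t \\
\text{Trend:}\quad       & Y_t = \mu + \beta t + \alpha Y_{t-1} + e_t
\end{aligned}
\]

In each case the null hypothesis is a unit root, \(\alpha = 1\), and rejecting it in favor of \(|\alpha| < 1\) supports stationarity.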


Trend is defined as “a regular, slowly evolving change in the series level”
(Brocklebank, 2003). We can eliminate the Trend type test since none of the time-series
in Appendix C exhibits this characteristic.


Drift is defined as “random variation about a non-zero level” (Yaffee,
2000). We can eliminate the Single Mean type test since all the time-series exhibit
variation about zero.


The type of test needed to look for stationarity conditions is the Zero
Mean type. Figure 5 clearly shows significant Rho and Tau probabilities for this type,
which indicates a stationary series. Appendix F contains similar Dickey-Fuller Unit Root
Test results for all time-series considered in this thesis.
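
The Zero Mean test discussed above can also be sketched outside SAS. The Python below is an
illustration only and is not the PROC ARIMA implementation: it assumes the lag-zero regression
of the first difference on the lagged level with no intercept or trend, reports n times the
estimated coefficient as a Rho-style statistic and the coefficient divided by its standard
error as Tau, and runs on simulated white noise rather than the thesis data. The function name
and the approximate 5% critical value of -1.95 for Tau (taken from standard Dickey-Fuller
tables for the no-intercept case) are assumptions that do not come from the thesis.

```python
import math
import random


def dickey_fuller_zero_mean(y):
    """Lag-0, zero-mean Dickey-Fuller regression: diff_t = delta * y_{t-1} + e_t.

    Returns (rho, tau): rho = n * delta_hat, tau = delta_hat / se(delta_hat).
    Hypothetical helper for illustration; requires at least three observations.
    """
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(diffs)
    sxx = sum(v * v for v in lagged)
    delta = sum(l * d for l, d in zip(lagged, diffs)) / sxx
    sse = sum((d - delta * l) ** 2 for l, d in zip(lagged, diffs))
    se = math.sqrt(sse / (n - 1) / sxx)
    return n * delta, delta / se


# Simulated stationary series (white noise), 60 monthly values.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(60)]
rho, tau = dickey_fuller_zero_mean(noise)

# A Tau well below about -1.95 rejects the unit root at roughly the 5% level.
print(f"rho={rho:.2f} tau={tau:.2f} stationary={tau < -1.95}")
```

A strongly negative Tau, as white noise produces, corresponds to the small Rho and Tau
probabilities that the text treats as evidence of stationarity.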
B. SAS TIME SERIES FORECASTING SYSTEM

1. Overview Of SAS TSFS 

Internal to SAS version 8.1 is a windows-based program called the Time Series
Forecasting System, or TSFS. TSFS is a powerful program capable of automatically
fitting forty-two different time series models and identifying the best-fitting model in
seconds. TSFS also provides statistics of fit and diagnostic tools, such as the
correlogram, in a windows-based point-and-click environment.

2. Considered Models 

SAS TSFS allows the user to select any number of the forty-two time-series
models for analysis. However, previous observations of the data allow the dismissal of
many of these models.


All time-series plots in this thesis contain some negative values. This allows the
removal of any natural log models, since the logarithm of a negative value is undefined.
An interpretation of the correlogram plots (autocorrelation and partial autocorrelation),
combined with the Dickey-Fuller Unit Root test for stationarity, allows elimination of
complicated Auto-Regressive Integrated Moving Average (ARIMA) models. Simple ARIMA models,
such as seasonal exponential smoothing which is an ARIMA (0,1,1), were left for TSFS to
consider. The ten models that remain viable for further analysis are:


• Seasonal Exponential Smoothing
• Winters Method-Additive
• Mean
• Simple Exponential Smoothing
• Double (Brown) Exponential Smoothing
• Linear Trend
• Linear (Holt) Exponential Smoothing
• Dampened Trend Exponential Smoothing
• Seasonal Dummy
• Linear Trend with Seasonal Terms


C. EVALUATING MODEL FIT

There are many different error measures available in TSFS to evaluate and
identify which time-series model fits best. Selection of an appropriate error measure is
more art than science. There is no single ‘best’ measure. Two commonly used time-series
error measures were considered for adoption in this thesis: Mean Square Error (MSE) and
Akaike Information Criterion (AIC). The formulas for MSE and AIC are given in Equations 3.1
and 3.2 respectively.


$$\mathrm{MSE} = \frac{\mathrm{SSE}}{T-1} \tag{3.1}$$

$$\mathrm{AIC} = -2\log(\text{max likelihood}) + 2k \tag{3.2}$$

where SSE = Sum of Squared Error
T = number of observations
k = number of parameters estimated


1. Mean Square Error

Mean Square Error (MSE), shown in Equation 3.1, is one of the most commonly
used measures in statistics and time-series analysis. It measures the average squared
prediction error. There are two drawbacks to using MSE or sum of squares (SSE) as evaluation
measures. Considered jointly, these drawbacks resulted in dismissing the idea of using
MSE as the model selection criterion.


First, MSE and SSE severely penalize outliers in the data. Nearly all of the 
captain time-series plots exhibit large spikes, or outliers. Second, using MSE as the 
selection criterion places no penalty on the number of parameters used in the model 
formulation. A model evaluation measure that avoids these problems is beneficial. 

2. Akaike Information Criterion 

Because of the issues with using MSE, a “more sophisticated model-selection
statistic is generally preferred” (Chatfield, 2001). Akaike’s Information Criterion (AIC),
shown in Equation 3.2, is a commonly used evaluation statistic that avoids the pitfalls of
MSE.


AIC is less sensitive to outlying data than MSE and considers the number of
parameters in the model. AIC balances precision of fit against the number of parameters
included in the model (Brocklebank, 2003). AIC selects the best fitting model, “as
measured by the likelihood function, subject to a penalty term that increases with the
number of parameters fitted in the model” (Chatfield, 2001). This penalty, which
increases as parameters are added to the model, prevents over-fitting. As a result, AIC
will choose the best-fitting model with the minimum number of parameter estimates. The
best model will have the smallest AIC.
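
To make the contrast between the two measures concrete, the following Python sketch computes
both from a list of one-step residuals. It is an illustration under stated assumptions, not
the TSFS computation: MSE follows Equation 3.1, while the AIC function assumes Gaussian errors,
in which case -2 log(max likelihood) reduces, up to an additive constant, to T ln(SSE/T). The
function names and the residual values are hypothetical.

```python
import math


def mse(residuals):
    """Mean square error as in Equation 3.1: SSE / (T - 1)."""
    t = len(residuals)
    sse = sum(e * e for e in residuals)
    return sse / (t - 1)


def aic_gaussian(residuals, k):
    """AIC assuming Gaussian errors (constant dropped): T * ln(SSE / T) + 2k."""
    t = len(residuals)
    sse = sum(e * e for e in residuals)
    return t * math.log(sse / t) + 2 * k


# Hypothetical residuals from two models that fit equally well.
residuals = [1.0, -1.0, 2.0, -2.0]
print("MSE:", mse(residuals))
print("AIC, 2 parameters:", aic_gaussian(residuals, k=2))
print("AIC, 3 parameters:", aic_gaussian(residuals, k=3))
```

With identical residuals, MSE cannot separate the two models, while AIC is larger by exactly
two for the model with one additional parameter, which is the penalty described above.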
IV. ANALYSIS OF RESULTS


A. OVERVIEW 

This chapter gives the results obtained from the SAS Time-Series Forecasting
System (TSFS). Section B presents two models, from among the ten considered, that
TSFS clearly identified as having the best “fit” using AIC as the selection criterion. The
three best-fit models identified by TSFS are presented in sections C and D for captains
and majors respectively. Section E contains a summary of results and discussion of why
the AIC scores are so similar for seasonal exponential smoothing and the Winters
additive method.


It is easy to get confused when looking at the computations behind time-series 
analysis. Notation is usually the root of this confusion. To assist the reader, a summary 
of the notation used in the time-series calculations can be found in Figure 7. 

$x_t$              observed loss value at time $t$
$\hat{x}_{t+s}$    predicted loss value $s$ periods from time $t$
$L_t$              Level value at time $t$
$T_t$              Trend value at time $t$
$S_t$              Seasonal value at time $t$
$\alpha$           Level smoothing constant
$\beta$            Trend smoothing constant
$\gamma$           Season smoothing constant

Figure 7. Notation


B. FORECAST METHODS 

Two of the ten viable time-series models presented earlier stood out from the rest. 
Those two models are seasonal exponential smoothing and Winters method-additive. In 
every considered time-series, these two models had the best and second-best fits. 

1. Seasonal Exponential Smoothing

The seasonal exponential smoothing model was selected by TSFS as having the
best fit in thirty-one of thirty-four instances. In the three series in which seasonal
exponential smoothing did not have the best fit, it had the second-best fit. Level and
seasonality smoothing equations are used in seasonal exponential smoothing to generate
predictions about future observations. Alpha ($\alpha$) and gamma ($\gamma$) are numeric
smoothing constants with values between zero and one chosen in an optimal way by
TSFS. Alpha and gamma were both determined to be .001 by TSFS.


As seen in equation 4.1, the level term, $L_{t+1}$, is calculated in three steps. First, the
previous seasonal value ($S_{t-s}$) is subtracted from the current observation ($x_t$) and
multiplied by the numeric constant alpha. Second, $(1-\alpha)$ is multiplied by the current
level value ($L_t$). Finally, the results from steps one and two are added to determine the
value of $L_{t+1}$.

$$L_{t+1} = \alpha\,(x_t - S_{t-s}) + (1-\alpha)\,L_t \tag{4.1}$$

$$S_{t+1} = \gamma\,(x_{t+1} - L_{t+1}) + (1-\gamma)\,S_{t+1-s} \tag{4.2}$$

$$\hat{x}_{t+1} = L_{t+1} + S_{t+1-s} \tag{4.3}$$

The seasonal term in equation 4.2, $S_{t+1}$, is calculated after the actual value of $x_{t+1}$
is known. It too is calculated in three steps. First, the previously calculated $L_{t+1}$ is
subtracted from $x_{t+1}$ and multiplied by gamma. Second, $(1-\gamma)$ is multiplied by the
seasonal value for this month from one year ago ($S_{t+1-s}$). Finally, the results from steps
one and two are added to determine the value of $S_{t+1}$.

Predictions from the seasonal exponential smoothing method are made using
equation 4.3. A prediction for the next period is made by summing the previously
determined level and seasonal values of $L_{t+1}$ and $S_{t+1-s}$.
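
The three steps above can be sketched in a few lines of Python. This is a minimal illustration
of equations 4.1 through 4.3 as reconstructed here, not the TSFS implementation; the function
names are hypothetical, and the input values are made up for readability rather than taken
from the thesis data.

```python
def ses_level(x_t, level_t, seasonal_t_minus_s, alpha):
    """Equation 4.1: next level from the deseasonalized observation and the current level."""
    return alpha * (x_t - seasonal_t_minus_s) + (1 - alpha) * level_t


def ses_forecast(level_next, seasonal_next_minus_s):
    """Equation 4.3: one-step forecast as level plus last year's seasonal value."""
    return level_next + seasonal_next_minus_s


def ses_seasonal(x_next, level_next, seasonal_next_minus_s, gamma):
    """Equation 4.2: seasonal update once the next observation is known."""
    return gamma * (x_next - level_next) + (1 - gamma) * seasonal_next_minus_s


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
level_next = ses_level(x_t=10.0, level_t=8.0, seasonal_t_minus_s=1.0, alpha=0.5)
forecast = ses_forecast(level_next, seasonal_next_minus_s=-2.0)
seasonal_next = ses_seasonal(x_next=7.0, level_next=level_next,
                             seasonal_next_minus_s=-2.0, gamma=0.25)
print(level_next, forecast, seasonal_next)
```

Note the order of operations: the forecast is available as soon as the level is updated, while
the seasonal update has to wait for the next observation.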


For example, suppose an estimate of next period’s Air Defense captain losses is
desired using seasonal exponential smoothing. Appendix B shows that the data spans
sixty months, so $t$ is 60. $x_{60}$, the current observed losses, is shown to be 2. The data is
believed to have twelve-month seasonality, which means $s$ equals twelve.

The first step in predicting losses for $x_{61}$ is to calculate $L_{t+1}$ ($L_{61}$). $S_{t-s}$
($S_{60-12}$) and $L_t$ ($L_{60}$) were determined to be .7684 and 2.078 respectively. Subtracting
the previous seasonal value of .7684 from the current $x_{60}$ observation of 2 and multiplying by
the given alpha of .001 arrives at a solution of 2.074 for $L_{t+1}$ ($L_{61}$).

A loss prediction for Air Defense captains in the sixty-first period can now be
made. Adding the value for $L_{t+1}$ ($L_{61}$) to the previously determined seasonal value
$S_{t+1-s}$ ($S_{60+1-12}$) gives the predicted Air Defense captain losses for period $x_{61}$ as -1.4.

2. Winters Method-Additive

Winters additive method is the current forecasting technique used by the Strength
Analysis and Forecasting Branch of the Army G-1. Like seasonal exponential smoothing,
it too was frequently selected by TSFS as having the best fit. Winters method-additive is
generally referred to as the Holt-Winters method. Winters method-additive bears a
striking similarity to the seasonal exponential smoothing model. The difference between the
two models is an additional parameter, trend, in the Winters-additive model. Level,
trend, and seasonality smoothing equations are given in equations 4.4, 4.5, and 4.6
respectively and are solved in a similar manner as the seasonal exponential smoothing
equations. The recursive equation used to make predictions about future observations is
in equation 4.7.


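$$L_{t+1} = \alpha\,(x_t - S_{t-s}) + (1-\alpha)\,(L_t + T_t) \tag{4.4}$$

$$T_{t+1} = \beta\,(L_{t+1} - L_t) + (1-\beta)\,T_t \tag{4.5}$$

$$S_{t+1} = \gamma\,(x_{t+1} - L_{t+1}) + (1-\gamma)\,S_{t+1-s} \tag{4.6}$$

$$\hat{x}_{t+1} = L_{t+1} + T_{t+1} + S_{t+1-s} \tag{4.7}$$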


TSFS determines the optimal values of $\alpha$, $\beta$, and $\gamma$, and automatically
calculates equations 4.1 through 4.7, making hand calculations unnecessary. For the
analyst, however, an understanding of what TSFS is calculating in the background, and how,
is necessary when explaining forecasted results to decision makers.
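
As an aid to that understanding, the Winters additive recursion can be sketched in Python. This
is a minimal illustration of equations 4.4, 4.5, and 4.7 as reconstructed here, not the TSFS
implementation; the function name and all input values are hypothetical. The seasonal update
(equation 4.6) is omitted because it has the same form as in seasonal exponential smoothing.

```python
def winters_additive_step(x_t, level_t, trend_t, seasonal_t_minus_s,
                          seasonal_next_minus_s, alpha, beta):
    """One step of the additive Winters recursion.

    Returns (level_next, trend_next, forecast) per equations 4.4, 4.5, and 4.7.
    """
    level_next = alpha * (x_t - seasonal_t_minus_s) + (1 - alpha) * (level_t + trend_t)
    trend_next = beta * (level_next - level_t) + (1 - beta) * trend_t
    forecast = level_next + trend_next + seasonal_next_minus_s
    return level_next, trend_next, forecast


# Hypothetical values with a trend of one loss per month.
with_trend = winters_additive_step(10.0, 8.0, 1.0, 1.0, -2.0, alpha=0.5, beta=0.5)
print(with_trend)

# With the trend fixed at zero (trend_t = 0, beta = 0) the step reduces to
# seasonal exponential smoothing, which is why the two models fit so similarly
# on series that show no trend.
no_trend = winters_additive_step(10.0, 8.0, 0.0, 1.0, -2.0, alpha=0.5, beta=0.0)
print(no_trend)
```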


C. CAPTAINS’ RESULTS

Table 4 shows the top three ‘best-fit’ models for captains. The single most
striking result is the dominance of seasonal exponential smoothing as the best fitting
model in sixteen of the seventeen branches. Just as remarkable is the dominance of
Winters-additive and the mean model as the second and third best fitting respectively.
Another not so apparent result shown in Table 4 is the similarity of the AIC scores.
In nearly all branches, the first and second choice models have AIC scores within one
percent. Such similar AIC scores could result in indifference when selecting which of the
two methods to use.


Branch   1st Model Choice          2nd Model Choice          3rd Model Choice
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Winters-Additive          Seasonal Expo Smoothing   Linear Trend Seasonal
         Seasonal Expo Smoothing   Winters-Additive          Mean

Table 4. Captains’ AIC Result Summary





D. MAJORS’ RESULTS 

Table 5 shows the top three ‘best-fit’ models for majors. A close observation of
AIC results leads to a similar conclusion of being indifferent between the first- and
second-choice models. Seasonal exponential smoothing is the best-fitting model for
fifteen of the seventeen branches with Winters-additive model chosen as the best for the
remaining two branches. The opposite is true for the second-choice model with Winters-additive
being preferred twelve times and seasonal exponential smoothing being
preferred five times.
Branch   1st Model Choice          2nd Model Choice          3rd Model Choice
         Seasonal Expo Smoothing   Winters-Additive          Linear Trend
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Winters-Additive          Seasonal Expo Smoothing   Linear Trend
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Linear Trend
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Linear Trend
         Winters-Additive          Seasonal Expo Smoothing   Linear Trend
         Seasonal Expo Smoothing   Winters-Additive          Mean
         Seasonal Expo Smoothing   Winters-Additive          Mean

Table 5. Majors’ AIC Result Summary








E. PREDICTIVE POWER 

Seasonal exponential smoothing and Winters method-additive provide nearly
indistinguishable ‘best’ fits for our training data. However, no claim can be made to their
predictive power. Having the best AIC score tells us little about how well the model
predicts. Comparing observed values in the test set against predicted values from the
seasonal exponential smoothing model provides such insight.


Figures 8 and 9 present correlation plots of predicted to observed values for
captains and majors respectively. The solid diagonal line represents perfect correlation or
a theoretical perfect prediction line. Data points on the line represent correct predictions.
Data points off the line represent observations that were incorrectly predicted.


The predicted net-losses of captains in 2004 are shown to be 21% correlated to the
observed net-losses in the test set. Only one branch, MI, is greater than 50% correlated.
This low correlation is partly related to unequally distributed large gain spikes that
characterize most of the captain time-series plots. The correlation of each basic branch
for captains is given in Table 6.


The correlation of predicted to observed net-losses for majors is shown to be 52%.
Eleven of the seventeen basic branches are greater than 50% correlated. The correlation
of each basic branch for majors is given in Table 7.
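
The thesis does not state the formula behind these correlations. Assuming they are Pearson
correlation coefficients between predicted and observed monthly net-losses, the calculation
can be sketched in Python as follows. The function name and the sample values are
hypothetical and are not taken from the test set.

```python
import math


def pearson(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical monthly values: points on a straight line correlate perfectly,
# even when the predictions are biased by a constant factor.
print(pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

One consequence worth keeping in mind when reading the plots: a high correlation shows that
predictions move with the observations, not that they lie on the perfect prediction line.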
[Figure 8. 2004 Captain Correlation Plot. Panel title: Observed vs. Predicted Captains, 2004;
vertical axis: Observed; horizontal axis: Predicted.]

AD 0.08    EN 0.31    MP 0.27    SF 0.11
AG 0.11    FA 0.12    MS 0.33    TC 0.21
AR 0.08    FI 0.14    OD 0.13
AV 0.21    IN 0.27    QM 0.35
CM 0.13    MI 0.55    SC 0.29

Table 6. 2004 Captain Correlation by Basic Branch

[Figure 9. 2004 Major Correlation Plot. Panel title: Observed vs. Predicted Majors, 2004;
vertical axis: Observed; horizontal axis: Predicted.]

AD 0.16    EN 0.67    MP 0.03    SF 0.63
AG 0.75    FA 0.62    MS 0.53    TC 0.52
AR 0.22    FI 0.02    OD 0.61
AV 0.65    IN 0.43    QM 0.34
CM 0.67    MI 0.54    SC 0.66

Table 7. 2004 Major Correlation by Basic Branch


F. SUMMARY 

Ten time-series models were produced in TSFS for each of thirty-four
combinations of basic branch and grade (captain and major). For each combination the
ten models were evaluated using the AIC (Akaike Information Criterion) and the three
best reported.


The captains’ TSFS results are shown in Table 4. With the exception of special
forces branch, seasonal exponential smoothing and Winters method-additive are the best
and second best fitting models respectively. The opposite is true in special forces branch
where Winters method is the best fitting and seasonal exponential smoothing the second
best. In every series, AIC scores for the first and second choice models differ by less
than one-half of one percent. AIC scores show that the fit of the top two fitting models is
nearly indistinguishable.


The majors’ TSFS results are shown in Table 5. The results are nearly the same
as those found for the captains. In every series, seasonal exponential smoothing and
Winters method-additive are the top two fitting models. Just as in the captains’ results,
the fit of these two models is nearly indistinguishable. The AIC scores for the first
and second choice models differ by less than one percent.


These best fitting models have weak predictive power. Predictions generated
from the seasonal exponential smoothing model for 2004 were compared to the
corresponding observed values in our test set. The observed and predicted values for
captains have a correlation of .21; for majors the correlation is .52. Correlation results for
each basic branch are shown in Tables 6 and 7.
V. CONCLUSION


A. SUMMARY 

Accurate prediction of officer loss behavior benefits decision and policy makers
alike. An inaccurate prediction of officer strength affects the number of personnel
authorizations, the Army’s budget, and the necessary number of accessions. Imbalances
of branches and career fields affect the Army’s combat readiness as a whole. An accurate
picture of officer loss behavior is especially sought for officers in the grades of O-3 and
O-4.


Captain and major losses, by basic branch, from October 1998 to September 2004
were aggregated into time-series. Ten different time-series forecasting techniques were
applied to each of thirty-four series through SAS TSFS to identify which models fit best.
The forecasting techniques applied were seasonal exponential smoothing, Winters
method-additive, mean, simple exponential smoothing, double exponential smoothing,
linear trend, linear (Holt) exponential smoothing, dampened trend exponential smoothing,
seasonal dummy, and linear trend with seasonal terms.


Akaike’s Information Criterion was used to evaluate the fit of each of the ten
models. Two models, seasonal exponential smoothing and Winters method-additive,
distinguished themselves from the others. These two models had the best fits in every
series.


However, these best fitting models have weak predictive power. Predictions from
the seasonal exponential smoothing model were compared to the observed values in our
test set. The predictions for captain net-losses are 21% correlated with the observed values,
and those for majors 52% correlated.

B. OVERALL CONCLUSIONS 

There is no universal ‘best-fit’ forecasting technique. Seasonal exponential
smoothing and Winters method-additive are shown to be the best fitting models for the
TAPDB-AO data. Using Akaike’s Information Criterion (AIC) as a selection statistic,
seasonal exponential smoothing and Winters method-additive are shown to have
indistinguishable fits.

Winters method-additive, the current technique used by the Strength Analysis and
Forecasting Branch, Army G-1, is validated. Although seasonal exponential smoothing is
less complex, having one less parameter, the increase in fit as measured by AIC is
negligible.


Officers who utilize time-series techniques to make net-loss predictions are
cautioned against expecting too much. The best prediction model was only 52% correlated
to the observed data. Time-series techniques should continue to be applied despite these
weak correlations until another model is proven better.

C. RECOMMENDATIONS FOR FURTHER STUDY

A comparison of the results of multiple regression and time-series is worth
investigating. Such a study would require the collection of external monthly econometric
variables such as gross domestic product, unemployment rate, durable good orders, etc.
Multiple regression may achieve better fitting models than the time-series shown here.
APPENDIX A   LOSS CODES

SPD Code   Narrative Reason

AR 635-5-1 Category: Resignation
BDK   Military Personnel Security Program
BHK   Substandard Performance
BNC   Unacceptable Conduct
BRA   Homosexual Act
BRB   Homosexual Admission
BRC   Homosexual Marriage (or Attempt)
DFS   In Lieu of Trial by Court-Martial
FCA   Early Release Program-Voluntary Separation Incentive
FCB   Early Release Program-Special Separation Benefit
FDF   Pregnancy or Childbirth
FDL   Ecclesiastical Endorsement
FFW   Failed Medical/Physical Procurement Standards
FHC   Immediate Enlistment or Reenlistment
FHG   Dismissal, No Review
FND   Miscellaneous/General Reasons

AR 635-5-1 Category: Involuntary discharge
JCC   Reduction in Force
JDK   Military Personnel Security Program
JDL   Ecclesiastical Endorsement
JDN   Lack of Jurisdiction
JFG   Competent Authority, Without Board Action
JFL   Disability, Severance Pay
JFM   Disability, Existed Prior to Service, Physical Evaluation Board (PEB)
JFP   Disability, Not in Line of Duty
JFR   Disability, Other
JFW   Failed Medical/Physical Procurement Standards
JGB   Non-Selection, Permanent Promotion
JHF   Failure to Complete Course of Instruction
JHK   Substandard Performance
JJD   Court Martial
JKB   Misconduct
JNC   Unacceptable Conduct
JND   Miscellaneous/General Reasons
JRA   Homosexual Act
JRB   Homosexual Admission
JRC   Homosexual Marriage (or Attempt)

AR 635-5-1 Category: Voluntary discharge
KCA   Early Release Program-Voluntary Separation Incentive
KCB   Early Release Program-Special Separation Benefit
KCC   Reduction in Force
KCM   Conscientious Objector
KCQ   Surviving Family Member
KDK   Military Personnel Security Program
KFF   Secretarial Authority
KHK   Substandard Performance
KNC   Unacceptable Conduct
KND   Miscellaneous/General Reasons

AR 635-5-1 Category: Involuntary release from active duty (REFRAD) or transfer
LBB   Maximum Age
LBC   Maximum Service or Time in Grade
LBK   Completion of Required Active Service
LCC   Reduction in Force
LFH   Failure to Accept Regular Appointment
LGB   Non-Selection, Permanent Promotion
LGC   Non-Selection, Temporary Promotion
LGH   Non-Retention on Active Duty
LHH   Dismissal, Awaiting Appellate Review
LND   Miscellaneous/General Reasons

AR 635-5-1 Category: Voluntary REFRAD or transfer
MBK   Completion of Required Active Service
MBM   Insufficient Retainability (Economic Reasons)
MCA   Early Release Program-Voluntary Separation Incentive
MCB   Early Release Program-Special Separation Benefit
MCC   Reduction in Force
MCE   To Attend School
MDB   Hardship
MDF   Pregnancy or Childbirth
MFF   Secretarial Authority
MGJ   Request for Extension of Service Denied
MGP   Interdepartmental Transfer
MGU   Enrollment in a Service Academy
MHC   Immediate Enlistment or Reenlistment
MND   Miscellaneous/General Reasons

AR 635-5-1 Category: Dropped from the rolls of the Army
PKB   Misconduct
PKF   Misconduct

AR 635-5-1 Category: Retirement
RBD   Sufficient Service for Retirement
RBE   Voluntary Early Retirement
RCC   Reduction in Force
RDL   Ecclesiastical Endorsement
RHK   Substandard Performance
RNC   Unacceptable Conduct
SBB   Maximum Age
SBC   Maximum Service or Time in Grade
SBE   Involuntary Early Retirement
SCC   Reduction in Force
SFJ   Disability, Permanent
SFK   Disability, Temporary
SGB   Non-Selection, Permanent Promotion
SHK   Substandard Performance
SNC   Unacceptable Conduct
VBK   Completion of Required Active Service
WEI   Disability, Permanent
WEK   Disability, Temporary
WEQ   Disability, Aggravation

AR 635-5-1 Category: Release from military control
YDN   Lack of Jurisdiction

944   Death
APPENDIX B   AGGREGATED LOSS TABLES

A. AGGREGATED CAPTAINS’ LOSSES

[Table of aggregated monthly losses. Recoverable column: Date, monthly from 10/1/1998
through 9/1/2004.]

B. AGGREGATED MAJORS’ LOSSES
APPENDIX C   LOSS GRAPHS

A. CAPTAIN LOSS GRAPHS BY BRANCH

[Time-series plots of monthly losses; vertical axis: Losses; horizontal axis: date, SEP98
through SEP03. Recoverable panel titles: AD, AG, AV, CM, IN, MI, MP, OD, QM, SF, and TC
Captain Losses.]

B. MAJOR LOSS GRAPHS BY BRANCH

[Time-series plots of monthly losses; vertical axis: Losses; horizontal axis: date, SEP98
through SEP03. Recoverable panel titles: AR, AV, and CM Major Losses.]
APPENDIX D   FORECAST GRAPHS

A. CAPTAINS’ FORECAST GRAPHS BY BRANCH

B. MAJORS’ FORECAST GRAPHS BY BRANCH
APPENDIX E   DICKEY-FULLER TEST

A. DICKEY-FULLER SAS CODE FOR MAJORS’ DATA

OPTIONS PAGENO = 1 LINESIZE = 74 PAGESIZE = 64;
title 'Dickey Fuller Test for Stationary Series';

data Majors;
INFILE 'C:\Documents and Settings\CptMajData\MajNoNames.txt' DLM='09'X
DSD MISSOVER;
INPUT
AD AG AR AV CM EN FA FI IN MI MP MS
OD QM SC SF TC;
date=intnx('month', '01OCT1998'd, _n_-1); /*creation of date variable*/
format date monyy5.;

PROC ARIMA DATA = MAJORS; identify var = AD STATIONARITY=(adf=(0));
/*adf=0, output is not a regression on the immediately previous
output*/
title 'AD Majors';
PROC ARIMA DATA = MAJORS; identify var = AG STATIONARITY=(adf=(0));
title 'AG Majors';
PROC ARIMA DATA = MAJORS; identify var = AR STATIONARITY=(adf=(0));
title 'AR Majors';
PROC ARIMA DATA = MAJORS; identify var = AV STATIONARITY=(adf=(0));
title 'AV Majors';
PROC ARIMA DATA = MAJORS; identify var = CM STATIONARITY=(adf=(0));
title 'CM Majors';
PROC ARIMA DATA = MAJORS; identify var = EN STATIONARITY=(adf=(0));
title 'EN Majors';
PROC ARIMA DATA = MAJORS; identify var = FA STATIONARITY=(adf=(0));
title 'FA Majors';
PROC ARIMA DATA = MAJORS; identify var = FI STATIONARITY=(adf=(0));
title 'FI Majors';
PROC ARIMA DATA = MAJORS; identify var = IN STATIONARITY=(adf=(0));
title 'IN Majors';
PROC ARIMA DATA = MAJORS; identify var = MI STATIONARITY=(adf=(0));
title 'MI Majors';
PROC ARIMA DATA = MAJORS; identify var = MP STATIONARITY=(adf=(0));
title 'MP Majors';
PROC ARIMA DATA = MAJORS; identify var = MS STATIONARITY=(adf=(0));
title 'MS Majors';
PROC ARIMA DATA = MAJORS; identify var = OD STATIONARITY=(adf=(0));
title 'OD Majors';
PROC ARIMA DATA = MAJORS; identify var = QM STATIONARITY=(adf=(0));
title 'QM Majors';
PROC ARIMA DATA = MAJORS; identify var = SC STATIONARITY=(adf=(0));
title 'SC Majors';
PROC ARIMA DATA = MAJORS; identify var = SF STATIONARITY=(adf=(0));
title 'SF Majors';
PROC ARIMA DATA = MAJORS; identify var = TC STATIONARITY=(adf=(0));
title 'TC Majors';

run;
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B. DICKEY-FULLER SAS CODE FOR CAPTAINS’ DATA 


OPTIONS PAGENO = 1 LINESIZE = 74 PAGESIZE = 64; 
title 'Dickey Fuller Test for Stationary Series'; 





























data Captains; 

INFILE 'C:\Documents and Settings\CptMajData\CaptNoNames.txt' DLM='09'X 
DSD MISSOVER; 

INPUT 
AD AG AR AV CM EN FA FI IN MI MP MS 
OD QM SC SF TC; 

date=intnx('month', '01OCT1998'd, _n_-1); /*creation of date variable*/ 

format date monyy5.; 














PROC ARIMA DATA = CAPTAINS; identify var = AD STATIONARITY=(adf=(0)); 
/*adf=(0): Dickey-Fuller test with zero augmenting lags, i.e., no lagged 
difference terms in the test regression*/ 
title 'AD CAPTAINS'; run; 

PROC ARIMA DATA = CAPTAINS; identify var = AG STATIONARITY=(adf=(0)); 
title 'AG CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = AR STATIONARITY=(adf=(0)); 
title 'AR CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = AV STATIONARITY=(adf=(0)); 
title 'AV CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = CM STATIONARITY=(adf=(0)); 
title 'CM CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = EN STATIONARITY=(adf=(0)); 
title 'EN CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = FA STATIONARITY=(adf=(0)); 
title 'FA CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = FI STATIONARITY=(adf=(0)); 
title 'FI CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = IN STATIONARITY=(adf=(0)); 
title 'IN CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = MI STATIONARITY=(adf=(0)); 
title 'MI CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = MP STATIONARITY=(adf=(0)); 
title 'MP CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = MS STATIONARITY=(adf=(0)); 
title 'MS CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = OD STATIONARITY=(adf=(0)); 
title 'OD CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = QM STATIONARITY=(adf=(0)); 
title 'QM CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = SC STATIONARITY=(adf=(0)); 
title 'SC CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = SF STATIONARITY=(adf=(0)); 
title 'SF CAPTAINS'; run; 
PROC ARIMA DATA = CAPTAINS; identify var = TC STATIONARITY=(adf=(0)); 
title 'TC CAPTAINS'; 

run; 
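
The seventeen near-identical PROC ARIMA steps above could also be generated by a short macro loop. The following is a minimal sketch, not the code used in this thesis: the macro name df_by_branch and its parameters ds, label, and branches are hypothetical, and it assumes the CAPTAINS data set has already been created by the DATA step above.

```sas
/* Hypothetical helper: run the Dickey-Fuller test (zero augmenting lags)
   once per branch variable. Names are illustrative assumptions. */
%macro df_by_branch(ds=, label=, branches=);
  %local i branch;
  %let i = 1;
  %let branch = %scan(&branches, &i);
  %do %while(%length(&branch) > 0);
    title "&branch &label";
    PROC ARIMA DATA = &ds;
      identify var = &branch STATIONARITY=(adf=(0));
    run;
    quit;
    /* advance to the next branch code in the space-separated list */
    %let i = %eval(&i + 1);
    %let branch = %scan(&branches, &i);
  %end;
%mend df_by_branch;

%df_by_branch(ds=CAPTAINS, label=CAPTAINS,
  branches=AD AG AR AV CM EN FA FI IN MI MP MS OD QM SC SF TC);
```

Calling the same macro with ds=MAJORS and label=Majors would, under the same assumptions, reproduce the majors' runs. A loop of this kind keeps every title paired with the variable actually tested.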
APPENDIX F 


DICKEY-FULLER UNIT ROOT TEST RESULTS 


TEST RESULTS FOR MAJORS BY BRANCH 

TEST RESULTS FOR CAPTAINS BY BRANCH 
Dickey-Fuller Unit Root Tests 
Rho       Pr < Rho    Tau     Pr < Tau 
8.6754    <.0001      8.96    <.0001 
8.7384    0.0005      8.89    0.0001 
9.3366    0.0001      8.87    <.0001 